Confidential · Internal Knowledge Agent

01 · The Challenge

Twelve years of knowledge — findable only if you already knew where.

Deal memos, IC decks, policies, and playbooks scattered across SharePoint, Notion, and shared drives. Every senior analyst's inbox was the search engine — and the bottleneck.

01 / 04

Tribal knowledge locked in 12+ years of files

Deal memos, IC decks, policies, and playbooks scattered across SharePoint, Notion, and shared drives. Findable only if you already knew where to look.

02 / 04

Senior analysts paged for the same questions weekly

The same precedent and policy questions cycled back to the same three or four people, every week — pulling them out of the work they were hired to do.

03 / 04

Strict data residency requirements

Nothing could leave the firm's environment. Public APIs were a non-starter. Whatever the answer was, it had to live entirely inside the VPC.

04 / 04

Onboarding measured in months

New hires spent weeks learning where knowledge lived before they could actually use it. Day-one productivity was a fantasy.

02 · The Solution

A single-tenant research agent — that cites every claim it makes.

Envyro partnered with the firm to design and deploy a single-tenant, entitlement-aware RAG agent — indexing 200,000+ internal documents and serving every team through Slack and the intranet.

Every answer carries citations back to the source document and page. Every retrieval respects the user's actual access rights. Nothing the user shouldn't see ever enters the prompt.

Built by Envyro · Running inside the firm's VPC.

Hybrid retrieval over 200K+ docs

BM25 + dense vectors over deal memos, IC decks, policies, and playbooks — twelve years of institutional knowledge made searchable.

Entitlement-aware

Every query is filtered through the user's access rights before retrieval. The model literally cannot reason over a document the user isn't entitled to see.
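As a rough illustration of filtering before retrieval, the sketch below assumes each document carries a set of groups allowed to read it and each user resolves to a set of group memberships. The `Doc` type, the group names, and the `entitled_subset` helper are hypothetical and do not describe the firm's actual access model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    allowed_groups: frozenset  # hypothetical ACL: groups permitted to read this doc
    text: str


def entitled_subset(docs, user_groups):
    """Return only the documents the user may read; runs before any search."""
    groups = frozenset(user_groups)
    return [d for d in docs if d.allowed_groups & groups]


# Hypothetical corpus and user, for illustration only.
corpus = [
    Doc("memo-1", frozenset({"deal-team-a"}), "Deal memo for transaction A"),
    Doc("policy-7", frozenset({"all-staff"}), "Travel and expense policy"),
    Doc("ic-deck-3", frozenset({"ic-members"}), "IC deck for transaction B"),
]

visible = entitled_subset(corpus, {"all-staff", "deal-team-a"})
visible_ids = [d.doc_id for d in visible]
print(visible_ids)
```

Because the search index only ever sees `visible`, a document outside the user's entitlements cannot reach the prompt, whatever the query says.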

Slack-native + intranet widget

Zero onboarding, no new app to learn. The agent lives where the work already happens — DM it, mention it, ask it inline.

Citation-first answers

Every claim links back to its source document and page. No source means no answer — and the user can verify in one click.
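One way to picture the "no source means no answer" rule is a gate that refuses to render an answer unless every claim carries at least one citation. This is a minimal sketch under that assumption; the `Claim` type, the `render_answer` helper, and the sample claim text are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    citations: list = field(default_factory=list)  # (doc_id, page) pairs


def render_answer(claims):
    """Return a cited answer, or None when any claim lacks a source."""
    if not claims or any(not c.citations for c in claims):
        return None  # caller treats None as "decline and route to a human"
    lines = []
    for c in claims:
        refs = ", ".join(f"{doc_id} p.{page}" for doc_id, page in c.citations)
        lines.append(f"{c.text} [{refs}]")
    return "\n".join(lines)


# Hypothetical claims, for illustration only.
cited = render_answer([Claim("Escrow was set at closing.", [("memo-1", 4)])])
uncited = render_answer([Claim("Escrow was set at closing.")])
print(cited)
print(uncited)
```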

03 · Deployment

One platform. Every team.

A single platform deployed inside the firm's VPC, rolled out team-by-team over six weeks. No public APIs, no third-party model exposure, no shared tenancy.

1 platform

Single-tenant, in-VPC

0

Public-API dependencies

24 / 7

Always-on

Step 01

Corpus + entitlements

Source connectors wired into SharePoint, Notion, and shared drives. The firm's access model mirrored exactly — no shortcuts.

Step 02

Index + tune

Hybrid retrieval indexed. Eval suite built against real internal questions. Citation behavior tuned until it stopped guessing.

Step 03

Live + observed

Rolled out team-by-team. Every query, retrieval, and feedback signal logged for tuning. The system gets sharper every week.

04 · Production Data

Where every query actually lands.

A representative month — roughly 7,200 queries across the firm, the vast majority answered cleanly with citations. The remainder routed to the right human SME with full context attached.

~7,200

Queries / month

4.1 sec

Avg first token

4 surfaces

Slack · intranet · email · API

● Live

In production

Where every query lands

By volume

Representative month · ~7,200 queries · outcome distribution

~92% answered cleanly · ~8% routed with full context

No hallucinated answers · every claim cited
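For a sense of scale, the stated split can be turned into approximate counts. The figures below are the rounded ones quoted above; the 30-day month is an assumption made only for the per-day estimate.

```python
QUERIES_PER_MONTH = 7_200  # "roughly 7,200" per the representative month
ANSWERED_PCT = 92          # approximate share answered cleanly
ROUTED_PCT = 8             # approximate share routed to an SME
DAYS_IN_MONTH = 30         # assumption, not stated in the source

answered = QUERIES_PER_MONTH * ANSWERED_PCT // 100
routed = QUERIES_PER_MONTH * ROUTED_PCT // 100
routed_per_day = routed / DAYS_IN_MONTH

print(answered, routed, round(routed_per_day, 1))  # 6624 576 19.2
```

On those assumptions, the SMEs collectively see on the order of 19 routed questions a day, each arriving with its context attached.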

05 · The Validation Gate

Ninety-two in a hundred answered cleanly. The rest get a warm handoff.

When the model isn't confident in a citation, it doesn't guess — it asks. The remaining 8% land in front of the right human SME with the question and partial context already attached.

Auto-answered

~92%

Answered with grounded citations

Hybrid-retrieved, entitlement-filtered, and cited back to source — answer delivered in the surface the user asked from, in under five seconds.

Routed

~8%

Routed to the right human SME

Surfaced to the SME with the original question, partial retrieval, and the model's hesitation reason — so the human picks up exactly where the agent stopped.
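The handoff described above can be sketched as a simple gate on citation confidence. The threshold value, the `Draft` fields, and the payload keys are hypothetical; the source does not say how confidence is measured or where the cutoff sits.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.75  # hypothetical cutoff, for illustration only


@dataclass
class Draft:
    question: str
    answer: str
    citation_confidence: float
    retrieved_doc_ids: list


def dispatch(draft, threshold=CONFIDENCE_THRESHOLD):
    """Auto-answer when confident; otherwise hand off with context attached."""
    if draft.citation_confidence >= threshold:
        return {
            "route": "auto",
            "answer": draft.answer,
            "citations": draft.retrieved_doc_ids,
        }
    # Low confidence: the draft answer is withheld, the context is forwarded.
    return {
        "route": "sme",
        "question": draft.question,
        "partial_retrieval": draft.retrieved_doc_ids,
        "hesitation_reason": (
            f"citation confidence {draft.citation_confidence:.2f} "
            f"below {threshold:.2f}"
        ),
    }


confident = dispatch(Draft("What is the escrow policy?", "See policy-7.", 0.91, ["policy-7"]))
hesitant = dispatch(Draft("What is the escrow policy?", "Unsure.", 0.40, ["memo-1"]))
print(confident["route"], hesitant["route"])
```

Keeping the draft answer out of the SME payload is a design choice in this sketch: the human sees the question and the partial retrieval, not a guess that might anchor their reply.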

When the model isn't confident in a citation, it doesn't guess — it asks. That single decision is what makes the system safe to run firm-wide.

06 · How It Works

From question asked to cited answer returned.

A single pipeline carries every query through five stages — entitlements, retrieval, generation, citation, and feedback — in under five seconds, with full traceability at every step.

~4.1 sec

First-token latency, end to end — including entitlement resolution and retrieval.

Zero leakage

Entitlement filter runs before retrieval. The model cannot reason over docs the user can't access.

Full audit trail

Every query, retrieval, and answer logged — every citation traceable back to source.
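A traceable log entry might look something like the JSON line below. This is a sketch only: the field names and the `audit_record` helper are hypothetical, and the source does not describe the actual log schema.

```python
import json
from datetime import datetime, timezone


def audit_record(user_id, question, retrieved_doc_ids, citations, now=None):
    """Serialize one query as a JSON line tying the answer to its sources."""
    now = now or datetime.now(timezone.utc)
    return json.dumps(
        {
            "ts": now.isoformat(),
            "user_id": user_id,
            "question": question,
            "retrieved_doc_ids": retrieved_doc_ids,
            # Each citation is a hypothetical {doc_id, page} pair.
            "citations": [{"doc_id": d, "page": p} for d, p in citations],
        },
        sort_keys=True,
    )


line = audit_record(
    "analyst-42",
    "What is the escrow policy?",
    ["policy-7", "memo-1"],
    [("policy-7", 3)],
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
record = json.loads(line)
print(record["citations"])
```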

Step 01

Question asked

Analyst pings the agent in Slack or the intranet widget. No new tool, no context-switch tax.

Step 02

Entitlements resolved

User's access rights loaded in real time. Retrieval scope is narrowed before search even runs.

Step 03

Hybrid retrieval

BM25 + dense search over the entitled subset of the corpus. Best of lexical and semantic, on the right slice.
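To make the lexical-plus-semantic idea concrete, the sketch below scores a tiny, already entitlement-filtered corpus with BM25 and with cosine similarity, then merges the two rankings using reciprocal rank fusion. The fusion method is an assumption, since the source does not say how the two signals are combined; the documents, the two-dimensional vectors standing in for embeddings, and all function names are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the tokenized query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[term] * (k1 + 1) / norm
        scores.append(s)
    return scores


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ranking(scores):
    """Doc indices ordered best-first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) across rankings."""
    fused = Counter()
    for r in rankings:
        for rank, idx in enumerate(r, start=1):
            fused[idx] += 1.0 / (k + rank)
    return [i for i, _ in sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))]


# Hypothetical entitled subset, tokenized; toy vectors stand in for embeddings.
docs = [
    ["escrow", "terms", "in", "the", "2019", "deal", "memo"],
    ["travel", "expense", "policy"],
    ["escrow", "holdback", "provisions", "ic", "deck"],
]
doc_vecs = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.3]]
query, query_vec = ["escrow", "terms"], [1.0, 0.0]

lexical = ranking(bm25_scores(query, docs))
dense = ranking([cosine(query_vec, v) for v in doc_vecs])
fused = rrf([lexical, dense])
print(fused)
```

Rank-based fusion is used here because BM25 scores and cosine similarities sit on different scales; combining ranks avoids having to normalize either one.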

Step 04

Grounded generation

LLM answers only from retrieved sources. No source means no answer — the system would rather say it doesn't know.

Step 05

Cited + logged

Answer returned with linked citations. Thumbs and corrections logged for ongoing tuning.

07 · Before / After

The same question, at a fraction of the wait.

What a single research question used to mean for the firm, versus what it means now. The shape of the work is the same; the time to answer collapsed.

Before · per question

30 – 45 min

Hunt across SharePoint, Notion, and shared drives
Slack DM the partner who probably knows
Wait for a reply — sometimes the next day
Re-read three old IC decks to triangulate
Lose context on the actual task at hand
Answer only as good as the inbox you searched

After · per question

< 10 sec

Ask in Slack or the intranet widget
Entitlement-filtered retrieval runs instantly
Answer returned with linked citations
Source documents one click away
Feedback signal logged for the next query
Works at 11pm, on weekends, on day one
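Taking the two headline figures at face value, the implied speedup is a short piece of arithmetic. It uses only the 30 to 45 minute and under-10-second figures above and treats 10 seconds as the worst case, so the results are lower bounds.

```python
BEFORE_MIN_S = 30 * 60  # 30 minutes, in seconds
BEFORE_MAX_S = 45 * 60  # 45 minutes, in seconds
AFTER_MAX_S = 10        # "< 10 sec", taken at its upper bound

speedup_low = BEFORE_MIN_S // AFTER_MAX_S
speedup_high = BEFORE_MAX_S // AFTER_MAX_S

print(speedup_low, speedup_high)  # 180 270
```

So each question is answered at least 180 to 270 times faster, before counting the cases where the old path meant waiting until the next day for a reply.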

08 · The Impact

The firm runs the same — just faster, longer, and on the record.

Senior analyst hours come back. New hires can contribute from day one. Knowledge stops leaving with people who leave. And nothing crosses the firm's boundary, ever.

i.

~6 hours per analyst per week, reclaimed

Returned to investment work, not file-hunting. Across the analyst bench, that's measurable IC throughput.

ii.

New-hire ramp cut from months to weeks

Day-one access to firm precedent and policy — without having to know which partner to ask first.

iii.

Institutional memory survives departures

Knowledge stays in the system, not in inboxes. When someone leaves, what they knew doesn't leave with them.

iv.

Zero data leaves the firm's environment

Single-tenant, in-VPC, audit-logged. The agent runs where the data already lives — no exceptions.

09 · Technology Stack

Single-tenant, entitlement-aware, and built for the firm's perimeter.

Trigger

Slack message · intranet widget query — per-question event ingestion.

Identity

Per-user entitlement resolver — access rights enforced before retrieval, not after.

Document Handling

Connectors for SharePoint, Notion, and shared drives — 200K+ docs indexed.

AI Layer

Hybrid BM25 + dense retrieval · private LLM gateway · grounded-only generation.

Surfaces

Slack-native bot · intranet widget · email digest · internal API.

Review Loop

SME routing for low-confidence answers · feedback signal captured per query.

Audit & Logging

Every query, retrieval, and answer logged · full access-rights audit trail.

10 · About Envyro

Production-grade AI agents — not demos.

Envyro is a specialized AI agency designing, deploying, and maintaining custom AI agents and pipelines that work in production. We stay on the call as your systems evolve.

SaaS · Collision Repair

Nexsyis

Shop management platform · AI email pipeline embedded into the stack.

Commercial · Maritimes

Office Interiors

Office equipment & service · bilingual voice AI for inbound calls.

Public Sector · Durham, NC

Durham County

350K+ residents · 24/7 GenAI resident support across municipal services.

Real Estate · NYSE

Veris Residential

$1.6B NYSE-listed REIT · resident-services AI across the portfolio.

How a global investment firm turned 12 years of deal memos into a 4-second answer.